Abstract


An overview of the methods of line-by-line and whole-band analysis developed 
by The Ohio State University spectroscopy group is presented. 
Introduction

One of the objectives of our group is the collection and analysis of laboratory
spectra of molecules of atmospheric interest. In recent years we have developed efficient
techniques for extracting the maximum amount of information from these spectra with a
minimum of observer bias. The approach to the analysis of single lines and entire bands
taken by the members of our group, listed in Table 1, is described here.
More detailed descriptions are given elsewhere (see references).


Single Line Analysis 

A typical problem in spectral analysis is the determination of line positions. One
method is to estimate the line center and to determine its position with respect to other
calibration features by interpolation, as indicated in Fig. 1. This method is often
adequate but it is tedious if there are many lines to be measured. It is also difficult to
estimate the precision of the results since the judgments used to determine the line
center and in the interpolation process cannot be quantified.

Less reliance is placed on the observer if the experimental signals $I_{exp}(\nu_i)$ are
compared with the corresponding signals $I_{art}(\nu_i)$ of an artificial line centered at
$\nu_o(art)$, as shown in Fig. 2a. The spectrum of the differences

$$\Delta I(\nu_i) = I_{exp}(\nu_i) - I_{art}(\nu_i),$$

shown in Fig. 2b, indicates the sensitivity of the data to changes in the line position. This
sensitivity can also be tested by calculating the partial derivative
$\partial I_{exp}(\nu_i)/\partial \nu_i$. A best estimate of the line position $\nu_o$ is
obtained by shifting the position $\nu_o(art)$ until the sum of the squares of the differences

$$SS = \sum_i \Delta I^2(\nu_i)$$

is a minimum, as indicated in Fig. 2c. The rate of change of the SS with respect to the
difference $\Delta\nu = \nu_o(exp) - \nu_o(art)$ gives a measure of the precision with which
the line center can be found. Unlike the first method, once the model for the artificial line and
the criteria for establishing the uncertainty of the line position have been defined, no
other observer judgment is required.
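
The position search described above can be sketched in a few lines. This is a minimal illustration only, not the group's program: it assumes a pure Lorentz absorption line with no instrument function, a noise-free synthetic spectrum, and a simple grid search. The function names, the line center 2622.37 cm$^{-1}$, and the half-width are all hypothetical.

```python
def lorentz_line(nu, nu0, strength=1.0, half_width=0.1):
    """Signal of one absorption line: unit background minus a Lorentz profile (assumed model)."""
    return 1.0 - strength * half_width**2 / ((nu - nu0)**2 + half_width**2)

def sum_of_squares(nu_grid, i_exp, nu0_art):
    """SS = sum_i [I_exp(nu_i) - I_art(nu_i)]^2 for an artificial line centred at nu0_art."""
    return sum((y - lorentz_line(nu, nu0_art))**2 for nu, y in zip(nu_grid, i_exp))

def best_position(nu_grid, i_exp, lo, hi, steps=2001):
    """Shift the artificial line across [lo, hi]; return the centre that minimises SS."""
    trials = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(trials, key=lambda centre: sum_of_squares(nu_grid, i_exp, centre))

# Synthetic "experimental" spectrum with a hypothetical line centre (illustrative values).
nu_grid = [2622.0 + 0.01 * k for k in range(101)]
true_centre = 2622.37
i_exp = [lorentz_line(nu, true_centre) for nu in nu_grid]

estimate = best_position(nu_grid, i_exp, 2622.2, 2622.6)
print(f"estimated line centre: {estimate:.4f} cm^-1")
```

How steeply `sum_of_squares` rises on either side of the minimum, compared with the SS expected from the noise alone, is what sets the precision of the retrieved position.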


Other parameters can also be determined by this method. By generating
artificial lines $I(\nu_i, S_{art})$ as shown in Fig. 2d, corresponding to lines of different
intensities, and minimizing

$$SS = \sum_i \Delta I^2(\nu_i, S_{art}),$$

where

$$\Delta I(\nu_i, S_{art}) = I_{exp}(\nu_i) - I(\nu_i, S_{art}),$$

estimates of the line intensity and its precision can be obtained. In this method


(1) The intensity and position estimates, and also estimates of other variables $p_j$
which affect the signals $I(\nu_i)$, are obtained from the same set of experimental
data.

(2) Estimates of the goodness of fit to $I_{exp}(\nu_i)$ are obtained from the sums SS. These
minima should not be significantly larger than those due to the noise in the data.

(3) Good fits are only obtained if the models are accurate. The accuracy of the
models as well as the parameter estimates $p_j$ can be judged by examination of the
residuals spectra $\Delta I(\nu_i)$.

(4) All the information about each model and its parameters contained in the spectra
can be extracted since all the data are examined.

(5) The ultimate precision is dependent on the characteristics of the experimental
data. Some of these characteristics are described by the partial derivatives
$\partial I_{exp}(\nu_i)/\partial p_j$.

(6) All of the parameters can be estimated simultaneously by using appropriate
least squares regression techniques.
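
Points (2) and (6) can be illustrated with a deliberately reduced problem. The sketch below assumes that only two parameters are free, the line intensity S and a constant baseline B, and that both enter the model linearly, so the normal equations can be solved in closed form. The full six-parameter problem discussed here is nonlinear and needs an iterative regression. All names and numerical values are hypothetical.

```python
import random

def lorentz(nu, nu0, alpha):
    """Lorentz profile normalised to 1 at the line centre."""
    return alpha**2 / ((nu - nu0)**2 + alpha**2)

def fit_intensity_and_baseline(nu_grid, signal, nu0, alpha):
    """Solve the 2x2 normal equations for the assumed model I(nu) = B - S * lorentz(nu)."""
    f = [lorentz(nu, nu0, alpha) for nu in nu_grid]
    n = len(f)
    sff = sum(v * v for v in f)
    sf = sum(f)
    sy = sum(signal)
    sfy = sum(v * y for v, y in zip(f, signal))
    det = n * sff - sf * sf
    s_hat = (sf * sy - n * sfy) / det      # retrieved intensity
    b_hat = (sff * sy - sf * sfy) / det    # retrieved baseline
    ss = sum((y - (b_hat - s_hat * v))**2 for v, y in zip(f, signal))
    return s_hat, b_hat, ss

# Simulated spectrum: S = 0.5, B = 1.0, Gaussian noise of standard deviation sigma.
rng = random.Random(0)
sigma = 1.0e-3
nu_grid = [-1.0 + 0.01 * k for k in range(201)]
signal = [1.0 - 0.5 * lorentz(nu, 0.0, 0.1) + rng.gauss(0.0, sigma) for nu in nu_grid]

s_hat, b_hat, ss = fit_intensity_and_baseline(nu_grid, signal, 0.0, 0.1)
# Goodness of fit: SS divided by the value expected from the noise alone.
noise_ratio = ss / ((len(nu_grid) - 2) * sigma**2)
print(s_hat, b_hat, noise_ratio)
```

A `noise_ratio` close to 1 indicates that the minimum SS is no larger than the noise would produce, which is the acceptance criterion stated in point (2).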

Niple (8,9) and Tu (10) have considered some of the limitations of the
experimental data by analyzing simulated spectra of single lines. They have shown that
the precisions of the retrieved parameters depend on

(a) the SNR of the spectrum,

(b) the number $N_o$ of data points analyzed,

(c) the spacing $\delta\nu$ of the data points,

(d) the number of models required and parameters retrieved,

(e) the accuracy of the models,

(f) the magnitudes and shapes of the partial derivatives $\partial I/\partial p_j$, and

(g) the correlations between the parameters $p_j$.

It is necessary to use a minimum of four models containing at least six adjustable
parameters to create a set of simulated signals $I_{exp}(\nu_i)$ suitable for analysis. These are
briefly described in Table 2. Tu (10) has investigated the information content in spectra
similar to those in Fig. 3, calculated for a Lorentz line, a triangular ISRF with $H = \alpha$, a
background and baseline independent of $\nu$, and a SNR of $10^4$. He has estimated the
dependence of the fractional uncertainty $\Delta S/S$ in the retrieved line intensity on S as
shown in Fig. 4 for the case where all six parameters are retrieved. Because of increasing
correlations between some of the parameters there is a limited range of S values for
which satisfactory retrievals can be made. These correlations are related to the
derivatives $\partial I(\nu)/\partial p$ shown in Fig. 5.
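
The link between derivative shapes and parameter correlations can be made concrete for the simplest case. The sketch below is an illustration of the mechanism only, not a reproduction of Tu's calculation. It assumes a two-parameter least-squares fit with uniform noise, for which the correlation coefficient of the two estimates follows directly from the overlap of the two derivative spectra. The model (intensity S and constant baseline B on a Lorentz line) and all numbers are hypothetical.

```python
import math

def lorentz(nu, nu0, alpha):
    """Lorentz profile normalised to 1 at the line centre."""
    return alpha**2 / ((nu - nu0)**2 + alpha**2)

def correlation_from_derivatives(d1, d2):
    """Correlation of two fitted parameters from their derivative spectra dI/dp1 and dI/dp2.

    Valid for a two-parameter least-squares fit with equal weights: the covariance is
    proportional to the inverse of [[d1.d1, d1.d2], [d1.d2, d2.d2]].
    """
    a = sum(x * x for x in d1)
    b = sum(x * y for x, y in zip(d1, d2))
    c = sum(y * y for y in d2)
    return -b / math.sqrt(a * c)

nu_grid = [-1.0 + 0.01 * k for k in range(201)]

def intensity_baseline_correlation(alpha):
    """Correlation between S and B for the assumed model I(nu) = B - S * lorentz(nu)."""
    d_s = [-lorentz(nu, 0.0, alpha) for nu in nu_grid]  # dI/dS
    d_b = [1.0 for _ in nu_grid]                        # dI/dB
    return correlation_from_derivatives(d_s, d_b)

rho_narrow = intensity_baseline_correlation(alpha=0.1)  # line well inside the window
rho_broad = intensity_baseline_correlation(alpha=5.0)   # line much wider than the window
print(rho_narrow, rho_broad)
```

When the two derivative spectra have nearly the same shape over the interval analyzed, as in the broad-line case, the correlation approaches 1 and the two parameters can no longer be retrieved independently.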
Some improvement in these fractional uncertainties is obtained by fixing one or
more of the parameters, although this may introduce systematic errors if these
parameters are fixed to incorrect values. It is seen from Fig. 4 that, at small S values,
the greatest improvement is obtained by fixing the baseline.

From these studies quantitative evaluations of experimental designs can be
made. These have yielded some interesting results in addition to confirming the intuitive
conclusions obtained by less rigorous analyses of experimental spectra. In particular

(1) The usable range can be determined. This is the range of experimental
conditions for which all six parameters of these simple models can be estimated.

(2) For the cases considered here this range is primarily limited by correlations
between S, B, $\alpha$, and H.

(3) The usable range can be extended by fixing one or more of the adjustable
parameters.

(4) The systematic errors introduced by fixing the parameters may be larger than
the uncertainties of the parameters retrieved by the regression methods used
here. This is especially likely if the fixed parameter was highly correlated with
a free parameter.
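
Point (4) can be checked numerically on a reduced example. The sketch below assumes a noise-free Lorentz line with intensity S = 0.5 on a constant baseline B = 1.0 and retrieves S while holding B fixed, first at the correct value and then at a value that is wrong by 1%. The model and all numbers are hypothetical and serve only to show the size of the effect.

```python
def lorentz(nu, nu0, alpha):
    """Lorentz profile normalised to 1 at the line centre."""
    return alpha**2 / ((nu - nu0)**2 + alpha**2)

def fit_intensity_fixed_baseline(nu_grid, signal, nu0, alpha, baseline):
    """Least-squares S for the assumed model I(nu) = baseline - S * lorentz(nu), baseline fixed."""
    f = [lorentz(nu, nu0, alpha) for nu in nu_grid]
    return sum((baseline - y) * v for v, y in zip(f, signal)) / sum(v * v for v in f)

# Noise-free simulated spectrum with S = 0.5 and B = 1.0 (illustrative values).
nu_grid = [-1.0 + 0.01 * k for k in range(201)]
signal = [1.0 - 0.5 * lorentz(nu, 0.0, 0.1) for nu in nu_grid]

s_correct = fit_intensity_fixed_baseline(nu_grid, signal, 0.0, 0.1, baseline=1.00)
s_biased = fit_intensity_fixed_baseline(nu_grid, signal, 0.0, 0.1, baseline=1.01)
print(s_correct, s_biased, (s_biased - s_correct) / s_correct)
```

In this example a 1% error in the fixed baseline produces an error of several percent in the retrieved intensity, even though the data contain no noise at all.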


Whole Band Analysis 

For many atmospheric problems it is sufficient to know the intensity, width, and
position of a few lines. However, these data also serve as the raw material from which
more fundamental molecular properties can be derived, provided a sufficiently large
number of lines in a vibration-rotation band can be analyzed.

The strongest band of CO$_2$ shown in Fig. 6 contains over 100 lines. If a
line-by-line analysis of this band were performed by the methods just described, up to 600
parameter values would be required. However, the spectrum in Fig. 6 contains three CO$_2$
bands with lines of measurable intensity, and thus more than 1000 parameters may be
required to describe this spectrum. Many of these lines lie outside the usable range
described above and most are overlapped and blended with other lines. Unless the
investigator makes arbitrary decisions concerning some of the adjustable parameters in
these line-by-line models, only a small fraction of the lines can be analyzed satisfactorily.

However, the parameters describing these lines are not independent and the
models for the background and baseline can be assumed to be smoothly varying functions
over the band. Thus the spectrum in Fig. 6 can be described by models similar to those in
Table 2 and a series of additional models which relate the parameters associated with the
individual lines (5, 6). The models describing these additional relationships and the typical
numbers of adjustable parameters in these models are shown in Table 3. By extending the
analysis of the previous section to the retrieval of the parameters in this table, the
problem is reduced to estimating of the order of 40 parameters rather than over 1000. We
have used this technique to analyze overlapping bands of CO, CO$_2$, and N$_2$O which contain
over 3000 data values and several hundred lines (see Refs. 3-7). The advantages of this
approach are

1. the parameters of primary interest are found directly from the experimental
data,

2. the number of parameters retrieved is relatively small and this allows their
precision to be increased,

3. the usable range is larger than that for single lines, because many of the highly
correlated parameters previously measured are no longer estimated directly, and

4. the accuracy of the band models can be rigorously tested.

When the band systems in Fig. 6 were analyzed (5) by this method the residuals
spectrum shown in the lower part of the figure was obtained. The weak Q branch near
2615 cm$^{-1}$ and associated P and R branch lines are just above the noise level and are too
weak to be modelled satisfactorily. Even so, the sum of the squares of the differences is
within 50% of that expected from the noise and this is considered to be a good retrieval.

This agreement confirms that the line shape, the band models, the ISRF, and
other instrumental effects have been modelled sufficiently well to describe the
experimental data. These data have an estimated SNR of 200-300 and were obtained with
a Fourier Transform Spectrometer (FTS) having a spectral resolution of about 0.06 cm$^{-1}$.
The results of the analysis of this band system have been described by Hoke (5, 6). The
center of the principal band was estimated to occur at 2614.241224 ± 0.000055 cm$^{-1}$; this
precision is at least an order of magnitude better than is usually claimed from
measurements made with instruments of similar resolution and performance.

If the background, baseline, and ISRF parameters are retrieved, together with
the band intensity and line width parameters, then the line intensities and widths can be
estimated to about 2%. In all other reported measurements of similar quantities these
instrumental parameters have been assumed known. When our analyses were repeated
with these same assumptions the uncertainties in the line intensities and widths were
reduced to less than 0.1%.
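
The reduction in the number of parameters achieved by a band model can be illustrated for the line positions. The sketch below uses the standard textbook expression for the rotational energy levels of a linear molecule, which is an assumption made here for illustration; the band models actually used are those of Refs. 5 and 6. The numerical constants are placeholders, not retrieved values, and no account is taken of missing levels due to nuclear spin statistics.

```python
def rotational_energy(j, b, d=0.0):
    """Rotational term value E(J) = B J(J+1) - D [J(J+1)]^2 of a linear molecule."""
    x = j * (j + 1)
    return b * x - d * x * x

def band_line_positions(nu0, b_upper, b_lower, d_upper=0.0, d_lower=0.0, j_max=50):
    """Return {label: wavenumber} for P and R branch lines with lower-state J up to j_max."""
    lines = {}
    for j in range(j_max + 1):
        lower = rotational_energy(j, b_lower, d_lower)
        lines[f"R({j})"] = nu0 + rotational_energy(j + 1, b_upper, d_upper) - lower
        if j >= 1:
            lines[f"P({j})"] = nu0 + rotational_energy(j - 1, b_upper, d_upper) - lower
    return lines

# Placeholder constants in cm^-1 (hypothetical, chosen only to resemble a band near 2614 cm^-1).
positions = band_line_positions(nu0=2614.24, b_upper=0.365, b_lower=0.368)
print(len(positions), "line positions generated from 3 band parameters")
```

Every line position in the band is tied to the same few constants, so each line contributes information about all of them. Similar relations for the intensities and widths lead to the parameter counts given in Table 3.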




Portions of the experimental spectrum, the corresponding calculated spectrum,
and the residuals spectrum obtained during the course of this investigation are shown in
Fig. 7. Although it is difficult to detect significant differences between the observed and
calculated spectra, the residuals spectrum shows systematic variations attributed to poor
modelling of the ISRF. These systematic differences are not seen in the final residuals
spectrum shown in Fig. 8.

In our analysis of CO, CO$_2$ and N$_2$O bands we have consistently obtained stable,
reproducible solutions, but the size of the data sets which can be handled and the number
of parameters retrieved are limited by the computer resources available. As these
resources expand it is expected that several spectra of the same band and/or several band
systems will be analyzable simultaneously. The band systems of most molecules are
described by more complicated models than those required in these investigations. In
these cases the whole band methods of analysis can be powerful tools for testing the
accuracies of these models and for identifying the significant parameters.

These analytical methods involve the application of standard statistical
techniques to the analysis of typical experimental data. However, they have advantages
over the more conventional methods of spectral analysis which are similar to the
advantages claimed for FTS over dispersive spectroscopic techniques. These are usually
described as the throughput and multiplex advantages.

By analogy with these comparisons the multiplex advantage of whole band
analysis over line-by-line analysis is the direct, simultaneous retrieval of the parameters
of interest. There is also a corresponding throughput advantage in that the entire data set
is searched for information about each parameter.
Table 2. Single Line Models and Parameters

     Model                                                    Minimum Parameters

1.   Line Shape (Lorentz, Voigt, ...)

2.   Instrument Spectral Response Function (ISRF)
     (sinc, triangle, Gaussian, ...)

3.   Background to line (polynomial in $\nu$ and other terms)

4.   Baseline to line (polynomial in $\nu$)









Table 3. Models for Describing Entire Bands

Quantity           Model                                              Number of Parameters

Line Intensities   Band intensity, $S_B = \sum S(\mathrm{lines})$     ~3 if interaction terms
                   (the sample temperature and [illegible]            are included
                   are also required).

Line Positions     $\nu = E' - E''$. For CO$_2$ the upper and         <8
                   lower state energies can each be modelled by
                   expressions containing 3 or 4 parameters.

Line Widths        The variation in the Lorentz widths over the       <10
                   band is modelled by empirical relations.

ISRF               The models contain 3-4 parameters which are        <4
                   assumed constant over the band.

Background         Several parameters may be required if channel      <10
                   spectra are present.

Baseline           Usually 2 terms are adequate.                      <3

Total number of parameters retrieved: 30 - 40.
Fig. 1. Line positions by interpolation. The position of the jth line is
$(2622 + \Delta x_1/\Delta x_2)$ cm$^{-1}$.


Fig. 2. a) The spectrum $I_{art}(\nu_i)$ of an artificial line is compared with the experimental
data $I_{exp}(\nu_i)$. b) The differences $\Delta I(\nu_i)$ are similar in shape to the partial
derivative $\partial I_{exp}(\nu_i)/\partial \nu$. c) The best estimate of the line position occurs at the
minimum value of the sum $\sum \Delta I^2(\nu_i)$. d), e), f) Corresponding steps required to
determine the line intensity.


Fig. 3. A set of lines generated by using a Lorentz line shape, a triangular ISRF with
$H = \alpha = 0.1$ cm$^{-1}$, and constant $I_o$ and B functions.


Fig. 4. The dependence of the fractional uncertainty in the retrieved line intensity on
the line intensity for the case of 6 retrieved parameters and for the cases where
B, $I_o$, and H are fixed to their correct values.


Fig. 5. The derivatives $\partial I(\nu_i)/\partial p_j$ as functions of $\nu$ for the parameters
$p_j = S, \alpha, \nu_o, H, I_o$ and B, and the dependence of these derivatives on the line intensity S.


Fig. 6. Upper curve: a spectrum of CO$_2$ near 2600 cm$^{-1}$. Lower curve: the residuals
spectrum after modelling the entire spectrum and retrieving more than 30
adjustable parameters.



Fig. 7. The upper portion of this figure shows a comparison between a portion of the
experimental spectrum in Fig. 6 and a calculated spectrum. The residuals
spectrum in the lower part of this figure has systematic variations attributed to
incorrect modelling.

Fig. 8. The upper part of this figure is the same portion of the experimental spectrum as
that in Fig. 7. The lower part of this figure is the residuals spectrum obtained by
using the best estimates for the parameters.







[Figure axis labels: "Simulated Spectrum"; "Fractional Uncertainty in Intensity, $\Delta S/S$"]